Two-class signal segmentation for speech/music detection in audio tracks

Authors

  • Mouhamadou Seck
  • Frédéric Bimbot
  • Didier Zugaj
  • Bernard Delyon
Abstract

We present a technique for the segmentation of a sound track into two classes of segments. Each frame of signal is preprocessed by extracting cepstral coefficients and their first-order derivatives. For each class, the distribution of the frame parameter vectors is modeled by a Gaussian Mixture Model (GMM). The GMM order is selected using two criteria: the Minimum Description Length (MDL) criterion and the Akaike Information Criterion (AIC). The frame score is based on a weighted log-likelihood ratio computed in a window around the frame. The decision for each frame is made by comparing its score to a threshold. Experiments are presented on speech/music segmentation in audio tracks. In these experiments, the MDL criterion leads to a reasonable GMM order. Using the MDL criterion for GMM order selection, the frame classification error rate is around 20%. However, using GMMs with much lower orders degrades performance only marginally.
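
The scoring and order-selection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the AIC and MDL expressions below are the common textbook forms, and the diagonal-covariance parameter count, the edge padding, the window weights, and the threshold of 0 are all assumptions made for illustration. The function names are hypothetical. The per-frame log-likelihoods under each class GMM are assumed to be computed elsewhere.

```python
import numpy as np


def gmm_param_count(order, dim):
    """Free parameters of a GMM, assuming diagonal covariances (an assumption)."""
    # (order - 1) mixture weights + order*dim means + order*dim variances
    return (order - 1) + 2 * order * dim


def aic(log_likelihood, n_params):
    """Textbook AIC: lower is better."""
    return -2.0 * log_likelihood + 2.0 * n_params


def mdl(log_likelihood, n_params, n_frames):
    """Textbook MDL: lower is better."""
    return -log_likelihood + 0.5 * n_params * np.log(n_frames)


def windowed_llr_scores(loglik_a, loglik_b, weights):
    """Weighted log-likelihood ratio in a window centered on each frame.

    loglik_a, loglik_b: per-frame log-likelihoods under the two class models.
    weights: odd-length window weights (normalized here to sum to 1).
    """
    llr = np.asarray(loglik_a, dtype=float) - np.asarray(loglik_b, dtype=float)
    w = np.asarray(weights, dtype=float)
    if len(w) % 2 == 0:
        raise ValueError("window length must be odd")
    w = w / w.sum()
    half = len(w) // 2
    # Repeat the edge values so every frame has a full window (an assumption).
    padded = np.pad(llr, half, mode="edge")
    # Reversing w turns the convolution into a sliding weighted sum.
    return np.convolve(padded, w[::-1], mode="valid")


def classify_frames(scores, threshold=0.0):
    """True where a frame is assigned to class A, False for class B."""
    return np.asarray(scores) > threshold
```

In this sketch, order selection would amount to fitting GMMs of several orders, evaluating `aic` or `mdl` for each, and keeping the order with the lowest value. Widening the window smooths the frame decisions at the cost of blurring segment boundaries.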

Similar resources

Audio content analysis for online audiovisual data segmentation and classification

While current approaches for audiovisual data segmentation and classification are mostly focused on visual cues, audio signals may actually play a more important role in content parsing for many applications. An approach to automatic segmentation and classification of audiovisual data based on audio content analysis is proposed. The audio signal from movies or TV programs is segmented and class...

A wavelet-based parameterization for speech/music segmentation

Speech/music discrimination is a challenging research problem that significantly impacts Automatic Speech Recognition (ASR) performance. This paper proposes new features for the speech/music discrimination task. We propose to use a decomposition of the audio signal based on wavelets, which allows a good analysis of non-stationary signals such as speech or music. We compute different...

Robust singing detection in speech/music discriminator design

In this paper, an approach for robust singing signal detection in speech/music discrimination is proposed and applied to audio indexing applications. Conventional approaches in speech/music discrimination can provide reasonable performance with regular music signals but often perform poorly with singing segments. This is due mainly to the fact that speech and singing signals are extremely cl...

Audio signal segmentation and classification using fuzzy c-means clustering

This paper proposes an audio signal segmentation and classification method using fuzzy c-means clustering. Recently, high-performance audio signal segmentation and classification has become necessary for audio-visual indexing because of the popularity of the Internet, higher-bandwidth network access, and the widespread use of digital recording and storage; several methods have been proposed. They...

A new approach for audio classification and segmentation using Gabor wavelets and Fisher linear discriminator

The rapid increase in the amount of audio data demands an efficient method to automatically segment or classify an audio stream based on its content. In this paper, an audio classification and segmentation method based on Gabor wavelet features is proposed. The method first divides an audio stream into clips, each of which contains one second of audio. Then, each clip is classified ...

Publication date: 1999